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Abstract 

A cellular automaton in which cells represent agents playing the Prisoner's 
Dilemma (PD) game following the simple "win-stay, loose-shift" strategy is studied. 
Individuals with binary behavior, such as they can either cooperate (C) or defect 
(D), play repeatedly with their neighbors (Von Neumann's and Moore's neighbor- 
hoods). Their utilities in each round of the game are given by a rescaled payoff 
matrix described by a single parameter r, which measures the ratio of temptation 
to defect to reward for cooperation. Depending on the region of the parameter space 
r, the system self-organizes - after a transient - into dynamical equilibrium states 
characterized by different definite fractions of C agents Cqo (2 states for the Von 
Neumann neighborhood and 4 for Moore neighborhood). For some ranges of r the 
cluster size distributions, the power spectrums P(f) and the perimeter-area curves 
follow power-law scalings. Percolation below threshold is also found for D agent 
clusters. We also analyze the asynchronous dynamics version of this model and 
compare results. 

keybords: Complex adaptive systems, Sociophysics, Econophysics, Agent-based models, 
Self-organized criticality. 
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1 Introduction 



The Prisoner's Dilemma (PD) game plays in Game Theory a role similar to the harmonic 
oscillator in Physics. Indeed, this game, developed in the early fifties, offers a very simple 
and intuitive approach to the problem of how cooperation emerges in societies of "selfish" 
individuals i.e. individuals which pursue exclusively their own self-benefit. It was used in 
a series of works by Robert Axelrod and co-workers [1] to examine the basis of cooperation 
in a wide variety of contexts. Furthermore, approaches to cooperation based on the PD 
have shown their usefulness in Political Science [2]- [4], Economics [5]- [11], International 
Affairs [12]-[15], Theoretical Biology [16]-[18] and Ecology [19]-[20]. 

The PD game consists in two players each confronting two choices: cooperate (C) 
or defect (D) and each makes his choice without knowing what the other will do. The 
four possible outcomes for the interaction of both agents are: 1) they can both cooperate: 
(C,C), 2) both defect: (D,D), 3) one of them cooperate and the other defect: (C,D) or 
(D,C). Depending on the case l)-3), the agents get respectively : the "reward" R, the 
"punishment" P or the "sucker's payoff" S the agent who plays C and the "temptation 
to defect" T the agent who plays D. These four payoffs obey the relations: 



The last condition is required in order that the average utilities for each agent of a 
cooperative pair (R) are greater than the average utilities for a pair exploitative- exploiter 
((R + S)/2). One can assign a payoff matrix M to the PD game given by 



which summarizes the payoffs for row actions when confronting with column actions. 
Clearly it pays more to defect: if one of the two players defects -say %- , the other who 
cooperates will end up with nothing. In fact, even if agent i cooperates, agent j should 
defect, because in that case he will get T which is larger than R. That is, independently 
of what the other player does, defection D yields a higher payoff than cooperation and is 
the dominant strategy for rational agents. Furthermore, is the Nash equilibrium [21] - i.e. 
a best reply to itself - of the PD game. The dilemma is that if both defect, both do worse 
than if both had cooperated: both players get P which is smaller than R. A possible way 
out for this dilemma is to play the game repeatedly. In this iterated Prisoner's Dilemma 
(IPD), there are several strategies that outperform the dominant [D,D] one-shot strategy 
and lead to some non-null degree of cooperation. 

The attainment of cooperation in PD simulations relies on different mechanisms and 
factors. A popular point of view regards direct reciprocity as the crucial ingredient. A 
typical exponent of this viewpoint is the strategy known as Tit for Tat (TFT) : cooperate 
on the first move, and then cooperate or defect exactly as your opponent did on the 
preceding encounter. This requires either memory of previous interactions or features 
("tags") permitting cooperators and defectors to distinguish one another [22]. 



T> R> P> S 
and 

2R> S + T. 



(1) 
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Spatial structure has also been identified as an influential factor in building coop- 
eration. For instance, in ref. [23] the authors neglected all strategical complexities or 
memories of past encounters. Instead, they show that spatial effects by themselves in a 
classic Darwinian setting are sufficient for the evolution of cooperation 1 . 

The problem of cooperation is approached mainly from an Darwinian evolutionary 
perspective: strategies that incorporate some dose of cooperative behavior are the most 
successful and propagate displacing competing strategies that do not. In that sense, a 
central concept is that of evolutionary stable strategy (ESS) [24], [25]: a strategy which if 
adopted by all members of a population cannot be invaded by a mutant strategy through 
the operation of natural selection. The evolutionary game theory, originated as an appli- 
cation of the mathematical theory of games to biological issues, later spread to economics 
and social sciences. 

In this work, we follow a different approach: there is no competition of different 
strategies, all the agents follow a natural strategy of "win-stay, loose-shift" known as 
Pavlov [26]. We do not worry about the resistance of the strategy against invasion by 
other strategies (like unconditional defectors or ALL D that play D independently of what 
the opponent does), rather we take Pavlov for granted. The rationale for this relies on 
several facts. First, Pavlov seems to be a widespread strategy in nature [27]. Second, 
Pavlov does pretty well when competing with several other strategies including generous 
tit-for-tat GTFT 2 as it was shown by Nowak and Sigmund [28]. Moreover, they found 
that in a non-spatial setting while Pavlov can be invaded by ALL D a slightly stochastic 
variant cannot. Third, experiments with humans have shown that a great fraction of 
individuals indeed use Pavlovian strategies [29]. 

Therefore, we address the analysis of the self-organized states that emerge when simple 
agents, possessing neither long term memory nor tags, play the PD game in a spatial 
setting using Pavlov strategy. We this aim we resort to a cellular automaton in which 
each cell is either black or white representing, respectively, a D or a C agent. Each agent 
plays with those belonging to his neighborhood, and the total utilities he gets determine 
the update of his individual state. 

We consider payoff matrices implying strict dilemmas defined by equations (1) rather 
than weak ones in which the inequalities are relaxed (for instance P = S). To simplify 
things we parameterize the payoff matrix in terms of a single parameter r, which measures 
the ratio of temptation to defect to reward for cooperation. 

Different self-organizations occur depending on the value of r, the type of dynamics 
and the considered neighborhood. In particular, for a range of values of r (that depends 
on the neighborhood) we found power law behavior that might be a signature of self- 
organized criticality [30]. 

Previously, a non spatial similar model, in which pairs of agents were chosen at ran- 
dom, was analyzed in ref. [31]. Also, a Mean Field stochastic version was considered in 
[32]. 

This work is organized as follows. In section 2 we describe the model. In section 3 we 

1 Indeed the game they considered is not exactly the PD and implies a "weak dilemma" in which D 
does not strictly dominate. 

2 GTFT cooperates after the opponent has cooperated in the previous round, but it also cooperates 
with a non null probability after the opponent has defected. 
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present the results of simulations as well as analytical results obtained by using a Mean 
Field approximation that neglects all spatial correlations (details in the appendix at the 
end). Section 4 is devoted to conclusions and final remarks. 



The model is vey simple: we assign to each agent, located at the cell with center at (x, y), 
a binary behavioral variable c(x,y) which takes the value "1" for C agents and "0" for D 
agents. This agent plays with the z agents belonging to his neighborhood N(x, y) getting 
a payoff U±(x,y) with the first neighbor he plays, U2(x,y) with the second one and so 
on 3 . The total utilities U(x,y) = Ui(x,y) + U2(x,y) + ... + U z (x,y) he gets playing 
with his neighborhood determine the update of his individual state. More technically, we 
have an outer totalistic cellular automaton i.e. the state of a cell at the next time-step 
depends only on its own state, and the sum of the states of its neighbors. The dynamic is 
synchronous: all the agents update their states simultaneously at the end of each lattice 
sweep. In addition to this synchronous dynamics or "parallel updating" we also explored, 
with less detail, the asynchronous dynamics or "sequential updating", in which the state 
of an agent is updated after he played. 

We considered two different neighborhoods: a) the von Neumann neighborhood (z — 4 
neighbor cells: the cell above and below, right and left from a given cell) and b) the Moore 
neighborhood (z — 8 neighbor cells: von Neumann neighborhood + diagonals). 

The payoff matrix is parameterized in terms of a single parameter r = T/R: 



with r > 1. The total utilities of the agent at (x,y) at time t, U(x,y,t), are the sum of 
the utilities collected by playing with each of his neighbors, as prescribed by the payoff 
matrix. 

A typical value for the population of agents is N ag = 10, 000 (100 x 100 lattice) 4 . 

The initial state at t — is taken as c(x,y; 0) = or 1 (D or C respectively), chosen 
at random for each cell (x,y). Then the system evolves by iteration during tf time steps 
till it reaches a stationary or dynamical equilibrium state. 

Pavlov's strategy works as follows. The agent at (x, y) will change his state for the 
next time step t+1: c(x, y,t+l) = l — c(x, y, t) (from C to D or viceversa) if U (x, y, t) < 0, 
and will remain the same: c(x,y,t+ 1) = c(x,y,t), if U(x,y,t) > (when U(x,y,t) = 
the agent changes with probability 0.5). Once all the agents have played, their state is 
updated for the next time iteration. 

For the von Neumann neighborhood then, each agent plays with his four nearest 
neighbors. Let's analyze what is expected to happen for different values of the parameter 
r. Let's focus on the agent at (x, y) and his possible configurations (C or D) and the ones 

3 The order in which a given agent plays with his neighbors doesn't matter, it can be fixed or randomly 
chosen 

4 However, in some cases we considered N ag up to 1,000,000 (1000 x 1000 lattice) in order the transients 
become long enough to extract the power spectrum. 



2 The Model 




(2) 
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of his neighborhood (number of C and D neighbors) and in each case his corresponding 
utilities. These results are shown in Table 1: 





4C, OD 


3C, ID 


2C, 2D 


1C, 3D 


OC, 4D 


c 


4 


3-T 


2-2r 


l-3r 


-4r 


D 


4r 


3t-1 


2r-2 


r-3 


-4 



Table 1. Utilities of a given agent depending if his state is C (row 1) or D (row 2) and 
the states of his neighborhood (columns 2 to 6) for von Neumann neighborhood. 

From Table 1, since r > 1, we observe that the sign of the utilities U(x,y) of the 
agent located at site (x, y) -which determines the update of his c(x, y)- depends on the 
value of r only for two ) if the agent plays C and his neighborhood consists in 

3C agents and ID or b) if the agent plays D and his neighborhood consists in 1C agent 
and 3D agents. In both cases the update rule depends thus whether r > 3 or r < 3. 
So, a priori, one would expect the existence of a "critical" value of the parameter r* = 3 
such that the results depend on whether r is greater or smaller than this critical value. 
Intuitively one can argue that since for r > 3 there are more favorable situations for 
D agents and disfavorable for C agents, the mean cooperation of the system when the 
dynamical equilibrium is reached, = — ^- J2N ag c { x ,U,t) -after the transient-, will be 
smaller than when r < 3. 

Table 2 summarizes the utilities of a player for each possible configuraqtion of his 
neighbors for the case of Moore neighborhood. 





8C, OD 


7C, ID 


6C, 2D 


5C, 3D 


4C, 4D 


3C, 5D 


2C, 6D 


1C, 7D 


0C,8D 


c 


8 


7-T 


6-2r 


5-3r 


4-4r 


3-5r 


2-6r 


l-7r 


-8r 


D 


8r 


7r -1 


6r-2 


5r-3 


4r-4 


3r -5 


2r-6 


T-7 


-8 



Table 2. The same as Table 1 but for Moore neighborhood. 

A completely analogous reasoning for the Moore neighborhood leads to three " critical" 
values: r* = 5/3, = 3 and r| = 7. Here we would expect also that Coo will diminish as 
r crosses each frontier value r* from left to right. 



3 RESULTS 

To avoid dependence on the initial conditions the measures correspond to averages taken 
over an ensemble of 100 systems with arbitrary initial conditions. In general, the results 
for the asymptotic regime, after a transient, become almost independent of the lattice 
size L for L > 100. Therefore in what follows, unless it is stated otherwise, the results 
correspond to simulations performed in 100 x 100 lattices. 

As we have anticipated, we observe that the stationary state of the system changes 
as the parameter r moves from one region to another (two regions in the case of z = 4 
von Neumann neighborhood and four regions for z = 8 Moore neighborhood). 
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3.1 Asymptotic average fraction of cooperators Cqo 

The asymptotic or equilibrium mean fraction of C-agents c^ 5 , takes constant values in 
each of the regions delimited by the "critical" r*. Hence we have one sharp step at Coo = 3 
for z = 4 and three sharp steps at Coo = |, 3 & 7 for z — 8. 

It is interesting to compare the Cqo, produced by simulations, with the c^ F obtained 
by elementary calculus using a Mean Field (MF) approximation that neglects all spatial 
correlations (see APPENDIX I). 

In Tables 3 and 4 we present the and c^ F for z = 4 and z = 8 respectively Clearly, 
as expected, the MF approximation improves increasing z. In addition, divergences be- 
tween spatial games and the MF approximation become maximum in the "cooperative" 
sector of the parameter r (leftist region, producing > 0.5 ). This can be explained in 
terms of the particular cluster structure of that region exhibiting power law scalings (see 
next subsection). 



z=4 


Simulations 


MF 


T < 3 


0.485 ±0.002 


0.430 


T > 3 


0.280 ±0.002 


0.342 



Table 3. The asymptotic fraction of cooperators for z = 4 von Neumann 
neighborhood. Column 2: simulations. Column 3: MF approximation (see APPENDIX 

I)- 



z=8 


Simulations 


MF 


1< t < 5/3 


0.563 ±0.002 


0.461 


5/3 < r < 3 


0.436 ± 0.002 


0.420 


3 < t < 7 


0.366 ±0.003 


0.386 


8 < T 


0.320 ±0.003 


0.334 



Table 4. The asymptotic fraction of cooperators Cqo for z = 8 More neighborhood. 
Column 2: simulations. Column 3: MF approximation (see APPENDIX I). 

3.2 Spatial Patterns: The Cluster Structure 

Von Neumann neighborhood 

In Fig. 1 we present snapshots -after the transient- of the cellular automaton for 
r < 3 and r > 3. These "cooperation maps" illustrate the differences between the typical 
spatial patterns that arise in the two parameter regions divided by t* = 3. 

For r < 3 we found that: 

I) Although the asymptotic probability for D agents is doo = 1 — Cqo ~ 0.5, which 
is below the percolation threshold p c ~ 0.59275, giant spanning D clusters often occur. 
Percolation below threshold is a known fact in other models. In general, when there are 
correlations between the sites, the threshold is shifted. As it happens, for instance, in the 
square Ising model percolation occurs, at the critical temperature, when the concentration 
is also 0.5. 



5 The upper bar in denote an average over 100 simulations with different initial conditions. 
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(a) 



(b) 




20 40 60 80 100 20 40 60 80 100 

Figure 1: Asymptotic "cooperation maps" for: (a) r < 3, (b) r > 3. Black=D, white=C. 



II) Different quantities behave as power laws implying thus the emergence of scale 
free phenomena. For instance, the size distribution of clusters of D agents exhibits power 
law scaling. 

For t > 3 the distribution of D clusters is bimodal with a peak for very small clusters 
(size=l) and a secondary peak for very large clusters. The main peak for very small clus- 
ters can be explained by the small correlation length. On the other hand, the secondary 
peak for very large sizes arises because the probability of a given site to be in the D state 
doo = 1 — Cqo is over the site percolation threshold and thus spanning clusters are much 
more abundant than when r < 3 in which case doo < p c . 

Fig. 2 is a plot of the log of the number of clusters of C and D agents vs. the 
log of their size for r < 3 and r > 3 using 400 x 400 lattices. In both cases giant 
spanning clusters of D agents were excluded. This, in particular for r > 3, eliminates a 
large number of clusters belonging to the secondary peak of its bimodal distribution and 
explains why there are less "+" points in Fig. 2- (b) than in 2-(a) (the shortage of "*" 
points, representing C clusters, obviously is related to the fact that Cqo is smaller on the 
r > 3 side). 



The data points for D clusters seem consistent with a power law scaling over a couple 
of decades, with a critical exponent of approximately — 1.79±0.02. The graphic also shows 
a difference between C and D clusters: the first ones exhibit much greater deviations from 
an exact power law although they also occur over a wide range of scales. This asymmetry 
can be traced to the difference that exists for the possible stable configurations of clusters 
of C's or D's; while the first ones need at least three C neighbors to remain C, the second 
ones can do well with only two C neighbors. Then the D agents can form thinner clusters 
than the C agents. This fact increases the probability of agents D to yield larger clusters. 
This also can explain why although the equilibrium probability for D agents is below the 
percolation threshold, giant spanning D clusters are observed. 
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Figure 2: Number of clusters of C (*) and D (+) agents vs. size of the clusters for the von 
Neumann neighborhood in a 400x400 lattice. The clusters are summed over the last 150 
lattice sweeps after the transient for: (a) r < 3, (b) r > 3. In both cases giant spanning 
D clusters were not included 



For r > 3 the situation changes drastically as Fig. 2.(b) reflects, here it can be seen 
that the data don't fit well with a power law neither for D nor for C clusters. 

Remark - To check that the power law scaling is not dependent on the particular 
parameterization of the payoff matrix we are using, we measured the cluster distribution 
for many other payoff matrices not described by (2). For instance, we considered this 
alternative parameterization of the payoff matrix 

M>-( (M) W-Z'T)) ( 3 ) 
~ UW2-3) (-1,-1) J' {S) 

with 3 — r/2 < —1 < 1 <r Again, we found power law behaviour for the leftist region in 
r. Thus, it seems that this power law scaling for an entire collection of PD payoff matrices 
is a robust property of the model. 

Another clue about the dynamics of the clusters can be obtained by examining the 
relation of the perimeter to the area of the clusters. We define the perimeter of a cluster C 
(D) as the set of sites (x, y) with behavioral variable c(x, y) — 1 (c(x, y) = 0) belonging to 
the cluster with at least one neighbor with the opposite behavioral variable i.e. c(x, y) — 
(c(x,y) = 1). The mean perimeter P(A), for a given area A, is then given by averaging 
over all the perimeters of clusters with area A. Fig. 3 shows that for r < 3 the perimeter 
scales linearly with the area, that is, at the fastest rate possible, implying that the clusters 
are highly ramified. The fraction of the area that is interior to the clusters can be easily 
calculated. 

By fitting the point of Figs. 3. (a) and (b) we get the following expressions for the 
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(a) x < 3, C clusters 



(c) t > 3, C clusters 







P=0.82*A + 1.4 


* 



200 400 600 800 

(b) t < 3, D clusters 




1000 2000 3000 4000 5000 



30 
20 
10 

o 

i 

5000 
4000 
3000 
2000 
1000 
0- 




P = 1*A - 0.0012 



10 20 30 40 50 
(d) t > 3, D clusters 



2000 4000 6000 8000 

Area 



Figure 3: Perimeter vs. area of the clusters of C and D agents for z=4. The perimeter's 
values plotted are averages of perimeters of clusters of the same size, taken over the last 
500 lattice sweeps after the transient. 



perimeter as a function of the area, for r < 3: 

Pc ~ 0.82Ac for clusters of C agents , . 

Pd ~ 0.86Ad for clusters of D agents. ^ ' 

Then the cluster interior fraction is F = ^j^-- Thus we get that approximately: 

F c c± 0.18 for clusters of C agents , ■ 

F D ~ 0.16 for clusters of D agents 

This shows that the clusters have almost no interior, and confirms our previous obser- 
vation concerning that the clusters of D agents are thinner than those of C agents. This 
supports quantitatively the explanation of why percolation of D agents is observed but 
no of C agents. The linear behavior shown in Fig.3.(c), which slope approximatelly equal 
to 1, can be understood by inspection of Fig.l.(b) where is clearly seen that the C agents 
form small "laddered" clusters in which the perimeter is equal to the area. 

Moore neighborhood 

For arbitrary random initial conditions, the equilibrium cooperation maps are shown 
in Fig. 4 for r in the different regions of interest. 

As it can be seen from Table 4, when r is within the interval (1, |), Cqo ~ 0.6 which 
is higher than the values obtained for the von Neumann neighborhood for any r. This 
implies that increasing the number of neighbors in general produces a higher fraction of 
cooperators, although this higher value of Coo is stable for narrower domain values of r. 
We checked this for the case in which 12 neighbors are taken into account, achieving a 
value of Coo — 0.8 for r G (1, |). 
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(a) (b) 




20 40 60 80 100 20 40 60 80 100 



Figure 4: Cooperation maps for Moore neighborhood at equilibrium (after 10 5 iterations) 
for: (a) r G (1,§), (b) r G (f,3), (c) r G (3,7) and (d) r > 7. Black=D, white=C. 



Let us analyze what happens to the clusters of C's and D's for the different values of 
r, this time for the Moore neighborhood. The results are shown in Fig. 5. 

In Fig. 5. (a), corresponding to r G (1, |) and Cqo ~ 0.57, we can observe power law 
behavior for both clusters of C and D agents, with the same critical exponent of ap- 
proximately — 1.62 ± 0.02. This symmetry between C's and D's is broken when we take 
r G (§, 3) (Fig.5.(b), ~ 0.44): here we recover the kind of behavior we found for t < 3 
in the case of the von Neumann neighborhood (see Fig. 2. (a)), for which the power law 
scaling for D agents is much more claear than for C agents. In this case we find an ex- 
ponent of approximately —1.98 ±0.04. Remarkably, criticality seems to persist, although 
not so clearly as in the previous cases, even for values of r in the interval (3, 7) (Fig.5.(c)). 
For r > 7, power law behavior is completely lost, as Fig.5.(d) shows. 



3.3 Power Spectrums 

The power laws we found for spatial observables might be interpreted as signatures of self- 
organized criticality (SOC). In order to elucidate the criticality or not of the dynamics we 
analyzed temporal correlations. Specifically, we calculated the power spectrum P(f) (i.e. 
the absolute value of the Fourier transform) of the time autocorrelation function G(t) of 
the cooperative fraction c(t). G(t) is defined as: 

G(t) =< c(t )c(t + t)>-< c(t ) > 2 , (6) 
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Figure 5: Number of clusters of C(*) and D(+) agents vs. size of the clusters, summed 
over the last 500 times after 10 4 iterations for z = 8, in logarithmic scale. The plots 
correspond to: (a) r G (1, §), (b) r G (|,3), (c) r G (3,7), (d) r > 7. There is a 
percolation peak for clusters of D agents in (b), (c) and (d) since they are above the 
percolation threshold (d > p c ). 
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Figure 6: Power spectrum for z = 4 Von Neumann neighborhood : (a) r < 3, (b) r > 3. 



where the average is taken over all possible temporal origins to- 

It turns out that although the transients are not very long, P(f) exhibits power law 
behavior, for the same range of values of r we found this type of behavior for the cluster 
size distributions, for almost two decades. For instance, in the case of the von Neumann 
neighborhood, we have a power law power spectrum for r < 3 which is lost for r > 3 
(which is consistent with the fact that the simulations have shown that for this region the 
system behaves periodically, with a very short period). This is shown in Fig. 6. 

The correlation function G(t) is calculated for the transient. In order to maximize 
this transient an initial c(t = 0) =0.1 very different from the known equilibrium value of 
Coo — 0.5 was taken together with a large lattice of 1000 x 1000. This power law scaling 
of P(f), for the same region we found this type of behavior for the cluster sizes, can be 
interpreted as another signature for the possible existence of critical dynamics. 

3.4 Asynchronous dynamics 

As we mentioned in the previous section, besides exploring the synchronous dynamics, we 
also performed some runs using the asynchronous dynamics, in which the state of each 
agent is updated after he played with his neighborhood. 

The asynchronous update produce a much less interesting situation. The power laws 
are lost, both for the von Neumann and Moore neighborhoods: we find no power laws 
for the cluster sizes nor for the power spectrum and the cooperation values decrease 
significantly. Still, there is a change in the mean value of the cooperation as the parameter 
r goes through the critical values calculated earlier. For the von Neumann neighborhood, 
for r < 3, Coo — 0.34. For r > 3 cooperation decreases to Coo — 0.23 and there is no clear 
pattern of behavior. For the Moore neighborhood results are similar, with Cqo — 0.34, 



11 



0. 30, 0.21 and 0.13 for r e (1, §), (§, 3), (3, 7) and r > 7 respectively. 

4 CONCLUSIONS 

For a cellular automata, representing a system of agents playing the IPD governed by 
Pavlovian strategies in a simple territorial setting, we explored its steady states for dif- 
ferent values of the parameter r , which measures the ratio of temptation to defect to 
reward. Both for the Von Neumann and Moore neighborhoods we found sharp steps for 
Cqo vs. r (one step in the first case and three steps in the second case). 

We found power-law scaling for different quantities, measuring either spatial (cluster 
size distributions) or temporal correlation (P(/)), for entire regions in parameter r space. 
All this may be interpreted as consistent evidences of self-organized criticality in a spatial 
game which is not evolutionary (at least in the ordinary Darwinian sense). This result, 
which is qualitatively robust against changes of the payoff matrix and the neighborhood, 
is novel (as far as we know). [It is worth to mention that the parameterization (3) allows 
to study two other games besides the PD: If — 1 < r < 1 (R > T > P > S) the game 
is known as "Stag Hunt" (SH) while when 4 < r < 8 (T > R > S > P) the game 
is called the "Hawk-Dove" (H-D). We simulated these two games, which are popular in 
Social Sciences and Biology respectively, and, in contrast to what happen with the PD, we 
found no power law behavior [35].] On the other hand, the occurrence of critical dynamics 
in certain spatial evolutionary games has been observed. For instance, in ref. [33] it was 
shown that for certain range of a parameter, which determines the punishment, the spatial 
HD game exhibits large temporal and spatial correlations and various processes governed 
by power-laws. This is in contrast with the simplified version of the PD considered in ref. 
[23], which does not exhibit complex critical dynamics of this type, rather it has periodic 
or chaotic dynamics. Nevertheless, for a stochastic version of this evolutionary weak 
dilemma, power law behavior consistent with directed percolation has been measured 
[34]. 

We also have shown that percolation below the threshold value occurs for D-agents for 
the case of the von Neumann neighborhood. The asymmetry between C and D clusters, 
even in cases in which both types of agents appear with equal probability, can be explained 
in terms of the Pavlovian strategy and the asymmetry of the payoffs (see Table 1). 

A result worth remarking is that the degree of cooperation can be increased by enlarg- 
ing the neighborhood but, simultaneously, the temptation parameter r must be restricted 
to smaller values. 

Another interesting general result is the effect of changing the dynamics from syn- 
chronous to asynchronous. The scale invariance we found for the synchronous update 
disappear when we turn to the asynchronous update. The fact that the general quali- 
tative behavior of asynchronous models may differ greatly from that of the synchronous 
version was noticed in [36]. 

Let us mention some interesting future extensions of the work presented here. For 
instance, we observed that for small lattices this simple deterministic system often reaches 
true equilibrium configurations, in which all the agents are happy (all get utilities above 0) 
and do not change their respective states. In other words, Pareto Optimal states (POS) 

1. e. states in which none of the players can increase their payoff without decreasing the 
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payoff of at least one of the other players. In Fig. 7 an example of such equilibrium states 
is presented for a small (6 x 6) lattice, z = 4 and r = 2. 
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Figure 7: Pareto Optimal states configuration for a small 6x6 lattice, z = 4 and r = 3. 
Left: the c(i,j) matrix. Right: the corresponding utilities U(i,j): the utilities for all the 
agents are positive and thus they don't change their behavioral variables. 



When the lattice size grows the system becomes unable to reach these POS. The 
explanation we found for this is , as the size grows, the fraction of POS with respect 
to the possible configurations decreases. Additionally, it is plausible that the entirely 
deterministic update does not provide a path in configuration space connecting the initial 
state with an POS. The introduction of noise in the update rule, in some particular 
cases, might help promoting ergodicity. The effect of the introduction of noise in spatial 
evolutionary games was analyzed for example in [38] and [39]. An interesting goal is 
how to use noise to avoid entrainment in non efficient states i.e. to implement a sort of 
simulated annealing approach [37] allowing to reach these optimal equilibriums. 

Another issue that seems worth exploring is the extension of the present approach, 
beyond the PD game, to games that are useful to model other different everyday situations, 
like the "Stag Hunt", "Chicken", etc [35]. 

Finally, after we concluded this manuscript, one of the referees pointed out the study 
of the PD game of Posch et al [40] using "win-stay, lose-shift" strategies in a non spa- 
tial setup. This work offers an stimulating discussion of when can satisficing become 
optimizing. 
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Estimate of can be obtained by elementary calculus using a Mean Field approxi- 
mation that neglects all spatial correlations. 

Once the stationary state was reached, the transitions from D to C, on average, must 
equal those from C to D. Thus, the average probability of cooperation Cqo is obtained 
by equalizing the flux from C to D, J C d, to the flux from D to C, Joe- The possible 
utilities for a C player range from R = zxl = ztoS = —zr (see Table 1 and Table 2). 
Let us consider by separate the z = 4 von Neumann neighborhood and the z = 8 Moore 
neighborhood. 

z = 4 

We have two different situations depending on the value of r: r < 3 or r > 3. 

• r < 3: 

In that case, the utilities Uc (Ud) of a C (D) player are negative, and thus he 
changes from C to D (D to C) if at least 2 (3) neighbors play D. For a given average 
probability of cooperation c, the probabilities of a C agent facing 2, 3 and 4 neighbors 
playing D are respectively: c 3 (l — c) 2 , c 2 (l — c) 3 and c(l — c) 4 . Consequently, Jqd 
can be written as: 

Jcd oc c 3 (l - cf + c 2 (l - cf + c(l - cf. (7) 

On the other hand, the probabilities of a D agent facing 3 and 4 neighbors playing 
D are respectively: (1 — c) 4 c and (1 — cf. Therefore Jdc is given by: 

J DC oc c(l - c) 4 + (1 - cf. (8) 

Thus the algebraic equation for is: 

c 3 oo + c 2 <) (l-c oo )-(l-c oo ) 3 = 0, (9) 

with only one real root in the interval [0,1]: c^ F = 0.430. 

• t > 3: 

In that case, the utilities Uc (Ud) of a C (D) player are negative, and thus he 
changes from C to D (D to C) exept (only) if he has all his 4 neighbors playing C 
(D). Therefore, J C d must be modified summing a term c 4 (l — c) to eq. (7) and the 
term c(l — c) 4 must be supressed from the expresion (8) for Jdc- Hence, we get the 
following algebraic equation for Coc,: 

4 + da - coo) + 4(i - Coo ) 2 - Coo (i - Coo ) 3 - (i - Coo ) 4 = o, (io) 

with only one real root in the interval [0,1]: c^ F = 0.342. 

z = 8 

We have four different situations depending on the region in the parameter space r. 
The corresponding polynomials for are obtained exactly as it was done for z = 4 
and one can easely check that are given by: 
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Coo ) (1 Coo ) 



(11) 



(12) 



with only one real root in the interval [0,1]: c, 



,MF 
eq2 



0.420. 



• 3 < t < 7 



4 + 4(1 - Coo) + 4,(1 - Coo) 2 + 4(1 - Coo) 3 + 
4,(1 - Coo ) 4 + 4(1 - Coo) 5 - (1 - Coo)' = o, 



(13) 



with only one real root in the interval [0,1]: c^f = 0.386. 



• 7 < r 



4 + 4(i - coo) + 4(i - coo) 2 + 4(1 - c co) 3 + 4(1 - Coo) 4 + 

4(1 - Coo) 5 + 4(1 - Coo) 6 + Coo(l - Coo) 7 " (1 - Coo) 8 = 0, 



(14) 



with only one real root in the interval [0,1]: c^f = 0.334. 
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